Robust Methods for Mean and Covariance Structure Analysis
Abstract
Covariance structure analysis plays an important role in the social and behavioral sciences for evaluating hypothesized influences among unmeasured latent and observed variables. Existing methods for analyzing these data rely on unstructured sample means and covariances estimated under normality, and evaluate a proposed structural model using statistical theory based on the normal theory MLE or on generalized least squares (GLS) with a weight matrix obtained by inverting a matrix based on sample fourth moments and covariances. Since the influence functions associated with these methods are quadratic, a few outliers can make these classical procedures a total failure. Considering that data collected in the social and behavioral sciences are not so accurate, robust methods are necessary in estimation and testing. Even though the theory for robustly estimating multivariate location and scatter has been developed extensively, very little has been accomplished in robust mean and covariance structure analysis. While robust principal components and canonical variates were described many years ago, this methodology is essentially exploratory in nature and provides neither tests of model fit nor the covariance matrix of the estimator, both of which are essential to covariance structure analysis. In this paper, several robust methods for model fitting and testing are proposed. These include direct estimation of M-estimators of structured parameters and a two-stage procedure based on robust M- and S-estimators of population covariances. The large sample properties of these estimators are obtained. The equivalence between a direct M-estimator and a two-stage estimator based on an M-estimator of the population covariance is established when sampling from an elliptical distribution. Two test statistics are presented for judging the adequacy of a hypothesized model; both are asymptotically distribution free if distribution free weight matrices are used, so these test statistics possess both small sample and large sample robustness. The two-stage procedures can easily be adapted into standard software packages by modifying existing GLS procedures. To demonstrate the easy application of the two-stage procedure, M-estimators under six different weight functions are calculated for a real data set. All the weight functions give the smallest weight to the case which had formerly been identified as the most influential point.

KEY WORDS: Outliers; M-estimators; S-estimators; two-stage estimators; robust tests; elliptical theory; distribution-free.

1 Introduction

Mean and covariance structure analysis plays a very important role in understanding the underlying structure of multivariate social data and in dimension reduction (Bentler and Dudgeon 1996). A special case of it is covariance structure analysis, in which the unknown mean is a nuisance parameter. In a typical covariance structure analysis setting, we have a random sample $X_1, \dots, X_n$ with $EX_i = \mu_0$ and $\mathrm{var}(X_i) = \Sigma_0 = \Sigma(\theta_0)$. A classical application is the confirmatory factor analysis model $\Sigma = \Lambda\Phi\Lambda^T + \Psi$, where the elements of $\Lambda$, $\Phi$, and $\Psi$ are elements of the parameter vector $\theta_0$. The primary interest is in getting a good estimator of $\theta_0$ and in judging the adequacy of the model $\Sigma = \Sigma(\theta)$. Because of the complexity of the data, a variety of methods have been proposed to estimate $\theta_0$ and evaluate the structural model. The best known and most often used method is classical normal theory maximum likelihood (ML).
In this approach, the normal theory likelihood based on the unstructured ML sample covariance S is maximized with respect to $\theta$ to obtain the structured estimator $\hat\Sigma = \Sigma(\hat\theta_n)$. Since the underlying data may not be normal, a generalized least squares method called asymptotically distribution free (ADF) was proposed by Browne (1982, 1984) and Chamberlain (1982). The idea of ADF is to fit $\Sigma(\theta)$ to S using an asymptotically correct weight matrix based on the sample fourth moments. Even though ADF gives correct inference when the sample size is very large, say 5000, for small to medium sample sizes ADF usually rejects the true model too often (Hu, Bentler and Kano 1992). Yuan and Bentler (1995) proposed a residual based regression approach. One variant of this approach effectively corrects Browne's (1984) ADF test statistic in such a way that it performs much better in empirical simulations.

All the above mentioned methods assume that the data are good, and model the sample covariance S by $\Sigma(\theta)$. In practice, real data may not be well-behaved. Hence, the standard methods may be susceptible to substantially worse performance than shown in simulations, which have relied upon smooth and well-behaved data generation mechanisms. In a regular case, it is appropriate to base the analysis on the sample covariance S computed from the standard formula, in which every observation is given equal weight. However, when the data generating mechanism is not well-behaved, S may not be an adequate unstructured estimator of $\Sigma$. Since the influence function associated with S is a quadratic function, a single case with gross error, or an outlier, can totally distort the sample covariance. Devlin, Gnanadesikan and Kettenring (1981) gave such an example in principal component analysis. Campbell (1980) also considered the effect of outliers on estimated covariances through examples. In the setting of covariance structure analysis, Tanaka, Watadani and Moon (1991) and Lee and Wang (1995) considered identifying the most influential points.

Using outlier detection to eliminate cases before computing the sample covariance S would hopefully improve the performance of normal theory ML or ADF. As discussed in Huber (1981, pp. 4-5), however, outlier rejection followed by a classical method may not be a good statistical procedure. For example, if the data are from a long-tailed distribution, the most influential points may not be outliers. After such points have been identified, a decision about them thus remains hard to make. Furthermore, the theoretical properties of such a procedure are not clear. A natural approach would thus be to use robust statistics in structural models.

In the literature on estimating the population mean and the population covariance matrix, a variety of robust procedures have been proposed. Examples are: various types of M-estimators investigated by Maronna (1976), Huber (1977), and Tyler (1983, 1987); the Stahel-Donoho estimator independently proposed by Stahel (1981) and Donoho (1982); the minimum distance estimators investigated by Tamura and Boos (1986) and Donoho and Liu (1988a, 1988b); the minimum covariance determinant (MCD) estimator and the minimum volume ellipsoid (MVE) estimator introduced by Rousseeuw (1983); the S-estimators investigated by Davies (1987) and Lopuhaä (1989); the $\tau$-estimators introduced by Lopuhaä (1991a); and many others.
Applications of robust procedures in principal component analysis, canonical variates, and related correlational problems have been considered by Devlin et al. (1981), Campbell (1980, 1982), and others (e.g., Wilcox 1994). In this paper, we consider using a robust procedure to estimate $\theta_0$ and to evaluate the general structure. In some sense, our work is an extension and application of the theory developed for estimating unstructured means and covariances, and parallels the work of Devlin et al. and Campbell. However, the procedures considered by Devlin et al. and Campbell are basically exploratory in nature. In covariance structure analysis, on the other hand, the emphasis is on evaluating a specific hypothesized structure $\Sigma(\theta)$ and on evaluating the significance of the parameter estimates. These problems require a valid test of the model structure $\Sigma(\theta)$ and a coordinated method for determining the covariance matrix of the estimator $\hat\theta_n$. Hence, previously developed robust methods for exploratory data analysis have to be extended to confirmatory mean and covariance structure analysis.

We will use the following notation. If A is a $p \times p$ matrix, $\mathrm{vec}(A)$ is the $p^2$-dimensional vector formed by stacking the columns of A, while $\mathrm{vech}(A)$ is the $p^* = p(p+1)/2$-dimensional vector formed by the nonduplicated elements of A when A is symmetric. We write $\sigma(\theta) = \mathrm{vech}(\Sigma(\theta))$ and $\varsigma(\theta) = \mathrm{vec}(\Sigma(\theta))$. When we refer to unstructured means and covariances, $\beta = (\mu^T, \mathrm{vech}^T(\Sigma))^T$ will be used. A function with a dot on top denotes a derivative, e.g. $\dot\sigma(\theta) = \partial\sigma/\partial\theta^T$ and $\dot h_\theta(x,\theta) = \partial h(x,\theta)/\partial\theta^T$. When a function is evaluated at the population value, we often omit its argument, e.g. $\dot\sigma = \dot\sigma(\theta_0)$.

We consider the M-estimator of a structured parameter in Section 2. In Section 3, we give a summary of the properties of existing robust estimators; the S-estimators of the multivariate mean and covariance are detailed, especially the different estimators of the asymptotic covariance under different assumptions on the population. Section 4 presents a two-stage estimation process based on estimators of the population covariance. In Section 5, we consider two test statistics for judging the adequacy of a structured model. All proofs of the theorems are in the appendix. A real data example is considered in Section 6.
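To keep the vec/vech bookkeeping concrete, the following minimal numpy sketch (our own illustration; the function names are ours) implements vec, vech, and the duplication matrix $D_p$ of Magnus and Neudecker (1988), which satisfies $D_p\,\mathrm{vech}(A) = \mathrm{vec}(A)$ for symmetric A.

```python
import numpy as np

def vec(A):
    # Stack the columns of A into one vector (column-major order).
    return A.reshape(-1, order="F")

def vech(A):
    # Nonduplicated elements of a symmetric matrix:
    # the lower triangle, taken column by column.
    p = A.shape[0]
    return np.concatenate([A[j:, j] for j in range(p)])

def duplication_matrix(p):
    # D_p maps vech(A) to vec(A) for symmetric p x p matrices A.
    pstar = p * (p + 1) // 2
    D = np.zeros((p * p, pstar))
    k = 0
    for j in range(p):              # column-major walk of the lower triangle
        for i in range(j, p):
            D[j * p + i, k] = 1.0   # entry (i, j) of A
            D[i * p + j, k] = 1.0   # its symmetric twin (j, i)
            k += 1
    return D

A = np.array([[2.0, 0.5, 0.1],
              [0.5, 1.0, 0.3],
              [0.1, 0.3, 1.5]])
assert np.allclose(duplication_matrix(3) @ vech(A), vec(A))
```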
2 M-estimator

Let $X_1, \dots, X_n$ be a sample of a p-dimensional random vector X. In this section, we develop an M-estimator of the structured parameter of a mean and covariance based on such a sample. We consider both structured means and structured covariances in the first part, and structured covariances with unstructured means in the second part. Although it is not our main emphasis, when the sample is assumed to come from an elliptical distribution, the asymptotic covariance involved can be substantially simplified. A distribution with density

$f(x) = |\Sigma_0|^{-1/2} h\{(x-\mu_0)^T\Sigma_0^{-1}(x-\mu_0)\}$,

where h(t) is a function independent of p, is referred to as an elliptical distribution or elliptically symmetric distribution (Fang, Kotz and Ng, 1990).

2.1 Structured mean and covariance

Maronna (1976) defined an M-estimator of $\mu_0$ and $\Sigma_0$. Now we have a structure $\mu_0 = \mu(\theta_0)$ and $\Sigma_0 = \Sigma(\theta_0)$. Following the maximum likelihood estimator of $\theta_0$ in an elliptical distribution and the M-estimator of $\mu_0$ and $\Sigma_0$ defined by Maronna (1976), we define an M-estimator $\hat\theta_n$ of $\theta_0$ by

$\frac{1}{n}\sum_{i=1}^{n} g_i(\hat\theta_n) = 0,$   (2.1)

where

$g_i(\theta) = g(X_i,\theta) = u_1(d_i)\,\dot\mu^T(\theta)\Sigma^{-1}(\theta)(X_i - \mu(\theta)) + \frac12\,\dot\varsigma^T(\theta)\{u_2(d_i^2)\,\mathrm{vec}(T(X_i,\theta)) - \mathrm{vec}(\Sigma^{-1}(\theta))\}$

with

$d_i = d(X_i,\theta) = \{(X_i-\mu(\theta))^T\Sigma^{-1}(\theta)(X_i-\mu(\theta))\}^{1/2},$
$T(x,\theta) = \Sigma^{-1}(\theta)(x-\mu(\theta))(x-\mu(\theta))^T\Sigma^{-1}(\theta).$

The functions $u_1(t)$ and $u_2(t)$ are weight functions. In the context of maximum likelihood in an elliptical family, $u_1(t) = u_2(t^2) = -2\dot h(t^2)/h(t^2)$.

In defining $\hat\theta_n$ in (2.1), we assume that the population $\theta_0$ satisfies $Eg(X,\theta_0) = 0$. In an elliptically symmetric family, this is justified if $\Sigma(\theta)$ is invariant under a constant scaling factor, as defined and discussed in Browne (1982, p. 77). When there is no structure on $\mu$ and $\Sigma$, exact assumptions for the existence and uniqueness of $\hat\mu_n$ and $\hat\Sigma_n$ were given by Maronna (1976). As noted by both Lopuhaä (1991b) and Tyler (1991), when $\psi_2(t) = tu_2(t)$ is a monotone function, which is assumption (C) of Maronna (1976, p. 53), there exists a unique M-estimator $\hat\Sigma_n$. However, as commented by Huber (1981, p. 223), the assumptions for uniqueness of the solutions to Maronna's equations (1.1) and (1.2) may be unrealistic for a specific set of data. For one-dimensional data generated from a Cauchy distribution, Fu (1989) demonstrated that the maximum likelihood equation can have multiple solutions at local maxima.

Since (2.1) is an estimating equation, we will use the results of Yuan and Jennrich (1995) to show that there is always a consistent solution of (2.1) for large n. We need the following assumptions.

Assumptions 2:
2.1. $\mu(\theta)$ and $\Sigma(\theta)$ are twice continuously differentiable, and $\Sigma(\theta) > \delta I$ in a neighborhood of $\theta_0$ for some $\delta > 0$.
2.2. With probability one, $u_1(d(X,\theta))$ and $u_2(d^2(X,\theta))$ are continuously differentiable with respect to $\theta$, and $u_1(d(x,\theta))$, $u_2(d^2(x,\theta))$, $\dot u_1(d(x,\theta))$ and $\dot u_2(d^2(x,\theta))$ are bounded.
2.3. X has finite fourth moments.
2.4. $Eg(X,\theta_0) = 0$, $V = \mathrm{var}(g(X,\theta_0)) > 0$, and $A = E\dot g_\theta(X,\theta_0)$ is nonsingular.

Assumptions 2.1 and 2.3 are similar to those made for maximum likelihood estimators based on normal theory. It is obvious that the $u_1(t)$ and $u_2(t)$ corresponding to a multivariate t-distribution satisfy assumption 2.2. Since most of the weight functions used in linear models have a finite number of discontinuity points and are bounded together with their derivatives, they also satisfy assumption 2.2. Assumption 2.4 consists of moment assumptions about the model.

Let $G_n(\theta) = \frac1n\sum_{i=1}^n g_i(\theta)$. With this definition, we have the following theorem.

Theorem 2.1. Under assumptions 2.1 to 2.4, for any neighborhood N of $\theta_0$, with probability one there are $\hat\theta_n \in N$ such that $G_n(\hat\theta_n) = 0$ for all n sufficiently large.

Theorem 2.1 establishes the existence of roots of (2.1). The following theorem addresses the consistency of the roots $\hat\theta_n$.

Theorem 2.2. Under assumptions 2.1 to 2.4, with probability one there are zeros $\hat\theta_n$ of $G_n(\theta)$ such that $\hat\theta_n \to \theta_0$.

Theorems 2.1 and 2.2 give the existence and consistency of a solution. If the solution of (2.1) is unique, it must be consistent. When equation (2.1) has multiple solutions, one needs a good algorithm to find the consistent solution, as demonstrated by Fu (1989) for the maximum likelihood estimator of a Cauchy distribution. Since it is not our purpose here to explore algorithms for obtaining $\hat\theta_n$, we will assume that a consistent $\hat\theta_n$ has been obtained in the rest of this paper.
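Although the choice of algorithm is left open here, estimating equations of this type are typically solved by iteratively reweighted (fixed-point) iteration. The sketch below is a minimal illustration for the unstructured special case $(\mu, \Sigma)$ with Huber-type weights; the cutoff constant, the rescaling of $u_2$, and the convergence tolerance are our own choices, not prescriptions from the theory above.

```python
import numpy as np
from scipy.stats import chi2

def m_estimate(X, q=0.95, tol=1e-8, max_iter=500):
    """Fixed-point iteration for an unstructured M-estimator of location
    and scatter with Huber-type weight functions u1 and u2."""
    n, p = X.shape
    k1 = np.sqrt(chi2.ppf(q, df=p))  # downweight roughly the largest (1-q) fraction
    mu, Sigma = X.mean(axis=0), np.cov(X, rowvar=False)
    for _ in range(max_iter):
        R = X - mu
        d = np.sqrt(np.einsum("ij,jk,ik->i", R, np.linalg.inv(Sigma), R))
        u1 = np.where(d <= k1, 1.0, k1 / d)   # weights for the mean equation
        u2 = u1**2
        u2 = u2 / u2.mean()                   # crude rescaling toward consistency
        mu_new = (u1[:, None] * X).sum(axis=0) / u1.sum()
        Sigma_new = (u2[:, None] * R).T @ R / n
        done = (np.abs(mu_new - mu).max() < tol
                and np.abs(Sigma_new - Sigma).max() < tol)
        mu, Sigma = mu_new, Sigma_new
        if done:
            break
    return mu, Sigma
```

A direct structured M-estimator as in (2.1) would instead solve $G_n(\theta) = 0$ in $\theta$, e.g. with a general root finder; the unstructured version above is also exactly the kind of input used by the two-stage procedure of Section 4.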
In order to evaluate hypotheses about $\theta_0$, we need to obtain the asymptotic distribution of $\hat\theta_n$. This is given in the following theorem.

Theorem 2.3. Let $G_n(\hat\theta_n) = 0$ and $\hat\theta_n \overset{P}{\to} \theta_0$. Then, under assumptions 2.1 to 2.4,

$\sqrt{n}(\hat\theta_n - \theta_0) \overset{L}{\to} N(0, \Omega)$, where $\Omega = A^{-1}VA^{-T}$.

To apply this result, we need a consistent estimator of $\Omega$. This can be obtained through consistent estimators of A and V, given respectively by

$A_n = \frac1n\sum_{i=1}^n \dot g_i(\hat\theta_n),$   (2.2)
$V_n = \frac1n\sum_{i=1}^n g_i(\hat\theta_n)g_i^T(\hat\theta_n).$   (2.3)

The consistency of $A_n$ and $V_n$ does not depend on the underlying distribution of the sample as long as assumption 2.4 holds. Estimators with this property will be referred to as "distribution free" estimators below. If the data are from an elliptical distribution, the expressions for A and V can be made much more explicit.

Corollary 2.1. Assume that the underlying distribution of X is elliptically symmetric. Then

$A(\theta_0) = \{(\frac1p E[\dot u_1(d)d] + Eu_1(d))\,\dot\mu^T\Sigma^{-1}\dot\mu + (\frac12 + \frac{1}{p(p+2)}E[\dot u_2(d^2)d^4])\,\dot\varsigma^T(\Sigma^{-1}\otimes\Sigma^{-1})\dot\varsigma + \frac{1}{2p(p+2)}E[\dot u_2(d^2)d^4]\,\dot\varsigma^T\mathrm{vec}(\Sigma^{-1})\mathrm{vec}^T(\Sigma^{-1})\dot\varsigma\}$   (2.4)

and

$V(\theta_0) = \frac1p E[u_1^2(d)d^2]\,\dot\mu^T\Sigma^{-1}\dot\mu + \frac{1}{2p(p+2)}E[u_2^2(d^2)d^4]\,\dot\varsigma^T(\Sigma^{-1}\otimes\Sigma^{-1})\dot\varsigma + \frac14\{\frac{1}{p(p+2)}E[u_2^2(d^2)d^4] - 1\}\,\dot\varsigma^T\mathrm{vec}(\Sigma^{-1})\mathrm{vec}^T(\Sigma^{-1})\dot\varsigma,$   (2.5)

where $d = d(X,\theta_0)$.

So in an elliptical family, there are several choices for a consistent estimator of $\Omega$. For example, both $V(\hat\theta_n)$ and $\frac1n\sum_{i=1}^n g_i(\hat\theta_n)g_i^T(\hat\theta_n)$ are consistent estimators of V. Similarly, we can estimate A by either $A(\hat\theta_n)$ or $\frac1n\sum_{i=1}^n\dot g_i(\hat\theta_n)$. Inside these estimators, $\Sigma$ can be estimated either by $\Sigma(\hat\theta_n)$ or by $\hat\Sigma_n$ from an unstructured model. Even though asymptotically each combination gives a consistent estimator of $\Omega$, these various alternatives may have nonignorable differences with small to medium sample sizes.

The influence function of $\hat\theta_n$ is

$IF(x) = A^{-1}g(x,\theta_0),$

which can be obtained from equations (4.2.9) and (4.2.10) of Hampel et al. (1986, p. 230) or by using the technique demonstrated in Serfling (1980, Section 6.5). It can be seen that in order to get an M-estimator with a bounded influence function, both $u_1(t)$ and $u_2(t)$ should go to zero at least as fast as 1/t. We can choose $u_1(t)$ and $u_2(t)$ either from the weight functions used in linear models, e.g., the Huber-type $\psi$-function, Hampel's redescending $\psi$-function, etc., or from some elliptical distribution, e.g., the multivariate t-distribution (Kano 1994; Kent, Tyler, and Vardi 1994). We will give more details in Section 6.

2.2 Structured covariance only

As we have mentioned, an important special case is a structured covariance with an unstructured mean. Let $\beta^T = (\mu^T, \theta^T)$. Equation (2.1) becomes the following two equations:

$\frac1n\sum_{i=1}^n u_1(d_i)(X_i - \mu) = 0,$   (2.6)
$\frac1n\sum_{i=1}^n \dot\varsigma^T(\theta)\{u_2(d_i^2)\,\mathrm{vec}(T(X_i,\theta)) - \mathrm{vec}(\Sigma^{-1}(\theta))\} = 0.$   (2.7)

Let

$g_i(\beta) = \begin{pmatrix} u_1(d_i)(X_i-\mu) \\ \dot\varsigma^T(\theta)\{u_2(d_i^2)\,\mathrm{vec}(T(X_i,\theta)) - \mathrm{vec}(\Sigma^{-1}(\theta))\} \end{pmatrix}.$

The asymptotic distribution of $\hat\beta_n$, which satisfies (2.6) and (2.7), is given by

$\sqrt{n}(\hat\beta_n - \beta_0) \overset{L}{\to} N(0, \Omega)$, where $\Omega = A^{-1}VA^{-T}$

with $A = E\dot g_i(\beta_0)$ and $V = \mathrm{var}(g_i(\beta_0))$. An interesting specialization of the asymptotic distribution of $\hat\beta_n$ occurs when the sample is from an elliptical distribution. Let

$a = \frac{1}{p(p+2)}E[\dot u_2(d^2)d^4], \quad b = \frac{1}{p(p+2)}E[u_2^2(d^2)d^4],$
$c = \frac{pE[u_1^2(d)d^2]}{\{pEu_1(d) + E[\dot u_1(d)d]\}^2},$

and

$p_1 = \mathrm{vec}^T(\Sigma^{-1})\dot\varsigma\{\dot\varsigma^T(\Sigma^{-1}\otimes\Sigma^{-1})\dot\varsigma\}^{-1}\dot\varsigma^T\mathrm{vec}(\Sigma^{-1}).$

Corollary 2.2. Assume the underlying distribution of X is elliptically symmetric. Then $\hat\mu_n$ and $\hat\theta_n$ are asymptotically independent, with

$\sqrt{n}(\hat\mu_n - \mu_0) \overset{L}{\to} N(0, c\Sigma)$ and $\sqrt{n}(\hat\theta_n - \theta_0) \overset{L}{\to} N(0, \Omega_{22}),$

where

$\Omega_{22}^{-1} = \frac{(1+2a)^2}{2b}\,\dot\varsigma^T(\Sigma^{-1}\otimes\Sigma^{-1})\dot\varsigma + \delta(p_1)\,\dot\varsigma^T\mathrm{vec}(\Sigma^{-1})\mathrm{vec}^T(\Sigma^{-1})\dot\varsigma$   (2.8)

and

$\delta(p_1) = \frac{(1+2a)^2 + 2a^2b(2+p_1) - b}{2b[(2+p_1)b - p_1]}.$

In an elliptical family, the influence functions of $\hat\mu_n$ and $\hat\theta_n$ can be separated.
The influence function of $\hat\mu_n$ was given by Maronna (1976) and is

$IF_{\hat\mu}(x) = \frac{u_1(d(x,\beta_0))(x-\mu_0)}{Eu_1(d(X,\beta_0)) + \frac1p E[\dot u_1(d(X,\beta_0))\,d(X,\beta_0)]}.$

The influence function of $\hat\theta_n$ is

$IF_{\hat\theta}(x) = A_{22}^{-1}\dot\varsigma^T\{u_2(d^2(x))\,\mathrm{vec}(T(x,\beta_0)) - \mathrm{vec}[\Sigma^{-1}(\theta_0)]\},$   (2.9)

where

$A_{22} = (1+2a)\,\dot\varsigma^T(\Sigma^{-1}\otimes\Sigma^{-1})\dot\varsigma + a\,\dot\varsigma^T\mathrm{vec}(\Sigma^{-1})\mathrm{vec}^T(\Sigma^{-1})\dot\varsigma.$

It can be shown that, after some standardization and setting $u_2(t) = 1$, (2.9) is equivalent to the influence function obtained by Tanaka et al. (1991) in their equation (6). They investigated the empirical influence of individual observations on $\hat\theta_n$. Since $u_2(t) = 1$, their influence function is quadratic in x.

The mean parameter is a nuisance parameter in covariance structure analysis, and $\hat\mu_n$ and $\hat\theta_n$ are asymptotically independent, so we can use any $\sqrt{n}$-consistent estimator $\hat\mu_n$ in equation (2.7) and solve for $\hat\theta_n$ only. This will not influence the asymptotic distribution of $\hat\theta_n$. For example, we may use the marginal median or the marginal trimmed mean as $\hat\mu_n$. If one is worried about losing the equivariance of $\hat\theta_n$, one can first solve for $\hat\mu_n$ and $\hat\Sigma_n$ as defined in equations (1.1) and (1.2) of Maronna (1976) to obtain an equivariant estimator $\hat\mu_n$, and then solve for $\hat\theta_n$ through (2.7). But this may not be much simpler than solving (2.6) and (2.7) simultaneously.

3 S-estimators of multivariate location and covariance

Even though M-estimators have bounded influence functions when the weight functions $u_1(t)$ and $u_2(t)$ are chosen properly, an unsatisfactory aspect of M-estimators is their low breakdown point in high dimensions. It has been demonstrated (e.g., Maronna 1976; Huber 1981) that the breakdown point of an M-estimator is at most $1/(p+1)$. Because of the relationship between S-estimators and M-estimators (Lopuhaä 1989), Lopuhaä (1991b) and Tyler (1991) argued that when an estimating equation defining M-estimators has multiple solutions, some solutions may have higher breakdown points, though it is not clear how to identify such solutions.

Because of the limited breakdown point of M-estimators in high-dimensional data, other robust procedures have been proposed to estimate $\mu_0$ and $\Sigma_0$. Most of the methods mentioned in Section 1 possess a high breakdown point. In order to obtain better robustness, we discuss some alternatives. Letting $\hat\beta_n$ be any consistent estimator of $\beta_0$, we need to assume that

$a_n(\hat\beta_n - \beta(\theta_0)) \overset{L}{\to} N(0, C(\beta))$   (3.1)

for some $a_n \to \infty$. Among all the high breakdown point estimators mentioned above, only the MCD estimator, the S-estimator and the $\tau$-estimator satisfy such an assumption. The other estimators, either because of unknown asymptotic properties or a nonnormal limiting distribution, do not satisfy (3.1), according to Lopuhaä (1991b) and Tyler (1991). For example, the minimum distance estimator is biased according to Tamura and Boos (1986).

In order to obtain a high breakdown point estimator, and to motivate the next section, a brief review of S-estimators, based mainly on Lopuhaä (1989, 1991b), will be given. Since the form of the asymptotic covariance of an S-estimator is almost the same as that of an M-estimator, a consistent covariance matrix of the estimators is easily obtained. As in the last section, we will contrast the different estimators and their underlying assumptions. Let $\rho(t)$ be a function on the real line which satisfies:

Assumptions 3:
3.1. $\rho(t)$ is symmetric, has a continuous derivative $\psi(t)$, and $\rho(0) = 0$.
3.2. There exists a finite $c_0 > 0$ such that $\rho(t)$ is strictly increasing on $[0, c_0]$ and constant on $[c_0, \infty)$.
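A standard choice satisfying assumptions 3.1 and 3.2 is Tukey's biweight function, which is also the choice highlighted by Lopuhaä (1989). A small sketch (our own illustration), with the tuning constant $c_0$ left free:

```python
import numpy as np

def rho_biweight(t, c0):
    # Tukey's biweight rho: symmetric, rho(0) = 0, strictly increasing
    # on [0, c0] and constant (= c0^2 / 6) on [c0, infinity).
    t = np.minimum(np.abs(t), c0)
    return (t**2 / 2) * (1 - t**2 / c0**2 + t**4 / (3 * c0**4))

def psi_biweight(t, c0):
    # Continuous derivative of rho: psi(t) = t (1 - (t/c0)^2)^2 for |t| <= c0.
    return np.where(np.abs(t) <= c0, t * (1 - (t / c0) ** 2) ** 2, 0.0)
```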
Under these assumptions, Lopuhaä (1989) defined the S-estimator $(\hat\mu_n, \hat\Sigma_n)$ as a solution to minimizing $\det(\Sigma)$ subject to

$\frac1n\sum_{i=1}^n \rho[\{(X_i-\mu)^T\Sigma^{-1}(X_i-\mu)\}^{1/2}] = b_0,$   (3.2)

where $0 < b_0 < \sup\rho(t)$. Further conditions for the existence, consistency, and asymptotic normality of $(\hat\mu_n, \hat\Sigma_n)$ were given in Lopuhaä (1989). In particular, $(\hat\mu_n, \hat\Sigma_n)$ satisfy the following two equations:

$\frac1n\sum_{i=1}^n u(d_i)(X_i-\mu) = 0,$   (3.3)
$\frac1n\sum_{i=1}^n \{pu(d_i)(X_i-\mu)(X_i-\mu)^T - v(d_i)\Sigma\} = 0,$   (3.4)

where $u(t) = \psi(t)/t$ and $v(t) = t\psi(t) - \rho(t) + b_0$. It is apparent that (3.3) and (3.4) are equations of the type that define M-estimators. Hence, the asymptotic distribution of $(\hat\mu_n, \hat\Sigma_n)$ is similar to that of M-estimators. Let

$\psi(x,\beta) = \begin{pmatrix} u(d)(x-\mu) \\ pu(d)\,\mathrm{vech}[(x-\mu)(x-\mu)^T] - v(d)\,\mathrm{vech}(\Sigma) \end{pmatrix};$

then the asymptotic distribution of $\hat\beta_n$ is given by

$\sqrt{n}(\hat\beta_n - \beta_0) \overset{L}{\to} N(0, \Pi)$, where $\Pi = A^{-1}\Lambda A^{-T}$

with $A = E\dot\psi_\beta(X,\beta_0)$ and $\Lambda = \mathrm{var}(\psi(X,\beta_0))$. Consistent estimators of A and $\Lambda$ are

$A_n = \frac1n\sum_{i=1}^n\dot\psi_\beta(X_i,\hat\beta_n),$   (3.5)
$\Lambda_n = \frac1n\sum_{i=1}^n\psi(X_i,\hat\beta_n)\psi^T(X_i,\hat\beta_n).$   (3.6)

The consistency of $A_n$ and $\Lambda_n$ does not depend on the underlying distribution of $X_i$ as long as the expectations $A = E\dot\psi_\beta(X,\beta_0)$ and $\Lambda = \mathrm{var}(\psi(X,\beta_0))$ exist. So

$\sqrt{n}(\hat\sigma_n - \sigma_0) \overset{L}{\to} N(0, \Pi_{22}),$   (3.7)

where $\Pi_{22}$ is the lower right $p^*\times p^*$ submatrix of $\Pi$. It can be consistently estimated by the corresponding submatrix of any consistent $\hat\Pi_n$.

The asymptotic variance can be further simplified if the sample is from an elliptical distribution. In this case $\hat\mu_n$ and $\hat\sigma_n$ are asymptotically independent, with

$\sqrt{n}(\hat\mu_n - \mu_0) \overset{L}{\to} N(0, \lambda_1\Sigma),$   (3.8)

where $\lambda_1 = pE\psi^2(d)/\{(p-1)Eu(d) + E\dot\psi(d)\}^2$, and

$\sqrt{n}(\hat\sigma_n - \sigma_0) \overset{L}{\to} N(0, \Pi_{22}),$   (3.9)

where

$\Pi_{22} = \lambda_2 D_p^+(\Sigma\otimes\Sigma)D_p^{+T} + \lambda_3 D_p^+\mathrm{vec}(\Sigma)\mathrm{vec}^T(\Sigma)D_p^{+T}$   (3.10)

with

$\lambda_2 = \frac{2p(p+2)E[\psi^2(d)d^2]}{[E\dot\psi(d)d^2 + (p+1)E\psi(d)d]^2}, \quad \lambda_3 = -\frac2p\lambda_2 + \frac{4E[\rho(d)-b_0]^2}{[E\psi(d)d]^2},$

and $D_p$ is the constant $p^2\times p^*$ duplication matrix as defined in Magnus and Neudecker (1988, p. 49), with Moore-Penrose inverse $D_p^+$.

As with M-estimators, the influence function of an S-estimator is bounded for a proper choice of the function $\rho(t)$. The breakdown point of S-estimators is not limited by the dimension of the data and can be as high as approximately 1/2. Generally, the asymptotic efficiency of an S-estimator is related to its breakdown point, and it is impossible to obtain high efficiency with a breakdown point around 1/2. However, there exist functions $\rho(t)$ which attain the same efficiency as an M-estimator together with a higher breakdown point. This is demonstrated in Lopuhaä (1989) using Tukey's biweight function as $\rho(t)$.

In principle, we could define an S-estimator of $\theta_0$ as was done by Lopuhaä (1989) or Davies (1987). Since minimizing $\det(\Sigma(\theta))$ under a structured constraint as in (3.2) in the space of $\theta$ does not have the obvious statistical meaning that it has for standard S-estimators of $(\mu, \Sigma)$, we prefer to use an S-estimator $\hat\beta_n$ for covariance structure analysis, as in the next section.
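The constant $b_0$ in (3.2) is commonly calibrated so that the S-estimator is consistent at the normal model, i.e. $b_0 = E\rho(\|Z\|)$ with $Z \sim N(0, I_p)$; the breakdown point is then $b_0/\sup\rho$. A minimal Monte Carlo sketch of this calibration, reusing rho_biweight from above (the constant 1.547 is quoted as the usual 50%-breakdown choice for p = 1, an assumption of this illustration):

```python
import numpy as np

def calibrate_b0(c0, p, n_draws=200_000, seed=0):
    # b0 = E rho(||Z||) for Z ~ N(0, I_p); ||Z||^2 is chi-square with p df.
    rng = np.random.default_rng(seed)
    d = np.sqrt(rng.chisquare(p, size=n_draws))
    b0 = rho_biweight(d, c0).mean()
    breakdown = b0 / (c0**2 / 6)   # sup rho = c0^2 / 6 for the biweight
    return b0, breakdown

b0, bp = calibrate_b0(c0=1.547, p=1)
print(round(b0, 4), round(bp, 2))   # breakdown point close to 1/2
```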
4 A two-stage estimation procedure

Suppose a robust estimator $\hat\beta_n$ satisfying (3.1) is available. For each n, let $M_n$ be a weight matrix. We define

$F_n(\theta) = (\hat\beta_n - \beta(\theta))^T M_n(\hat\beta_n - \beta(\theta)),$   (4.1)

and the estimator $\tilde\theta_n$ that satisfies $F_n(\tilde\theta_n) = \min_{\theta\in\Theta}F_n(\theta)$ will be called a minimum chi-square estimator. This term follows Ferguson (1958), who defined a minimum chi-square estimator for the case $a_n = \sqrt{n}$ in (3.1). So we can use Ferguson's (1958) results for the consistency and asymptotic normality of $\tilde\theta_n$ if $a_n = \sqrt{n}$. Since some robust covariance estimators whose theoretical properties are currently unknown may later be found to possess $a_n$-consistency, we present our application in a slightly more general setting. For example, we do not need the asymptotic normality of $\hat\beta_n$ in order to obtain the consistency of $\tilde\theta_n$. The following assumptions are required.

Assumptions 4:
4.1. $\Theta$ is compact.
4.2. $\beta(\theta)$ is identified; that is, $\beta(\theta_1) = \beta(\theta_2)$ with $\theta_1, \theta_2 \in \Theta$ implies $\theta_1 = \theta_2$.
4.3. $\beta(\theta)$ is continuously differentiable on $\Theta$ and $\dot\beta(\theta_0)$ is of full column rank.
4.4. $M_n \overset{P}{\to} M > 0$.

The following theorem gives the consistency of the minimum chi-square estimator $\tilde\theta_n$.

Theorem 4.1. Under assumptions 4.1, 4.2, 4.4 and the continuity of $\beta(\theta)$, if $a_n(\hat\beta_n - \beta(\theta_0)) = O_p(1)$ for some $a_n\to\infty$, then $\tilde\theta_n \overset{P}{\to} \theta_0$.

To obtain $\tilde\theta_n$, we usually need to solve

$\dot\beta^T(\theta)M_n(\hat\beta_n - \beta(\theta)) = 0,$   (4.2)

and the solution is indeed $\tilde\theta_n$ if it is unique. For the asymptotic normality of $\tilde\theta_n$, we need to assume the asymptotic normality of $\hat\beta_n$.

Theorem 4.2. Under assumptions 4.3 and 4.4, and assuming $a_n(\hat\beta_n - \beta(\theta_0)) \overset{L}{\to} N(0, C)$ for some $a_n\to\infty$,

$a_n(\tilde\theta_n - \theta_0) \overset{L}{\to} N(0, \Omega)$, where

$\Omega = (\dot\beta^T M\dot\beta)^{-1}\dot\beta^T MCM\dot\beta(\dot\beta^T M\dot\beta)^{-1}.$   (4.3)

When C is nonsingular and a consistent estimator $\hat C_n$ is available, we can choose $M_n = \hat C_n^{-1}$, and (4.3) simplifies to

$\Omega = (\dot\beta^T C^{-1}\dot\beta)^{-1}.$   (4.4)

This is the covariance matrix of the minimum variance estimator.
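To make the two-stage procedure concrete, the sketch below minimizes $F_n(\theta)$ in (4.1) for a covariance-only structure. It is our own illustration: the robust input s_hat (the vech of a robust covariance estimate), the weight matrix M_n, and the one-factor placeholder structure are all assumptions of the example, not prescriptions of the theory.

```python
import numpy as np
from scipy.optimize import minimize

def fit_minimum_chisquare(s_hat, sigma_fun, M_n, theta0):
    """Minimize F_n(theta) = (s_hat - sigma(theta))' M_n (s_hat - sigma(theta)),
    where s_hat is the vech of a robust covariance estimate."""
    def F_n(theta):
        r = s_hat - sigma_fun(theta)
        return r @ M_n @ r
    res = minimize(F_n, theta0, method="BFGS")
    return res.x, res.fun

def sigma_onefactor(theta, p=4):
    # Placeholder structure: Sigma(theta) = lam lam' + diag(psi).
    lam, psi = theta[:p], theta[p:]
    S = np.outer(lam, lam) + np.diag(psi)
    return np.concatenate([S[j:, j] for j in range(p)])   # vech, column-major
```

With $M_n = \hat C_n^{-1}$ for a consistent estimator $\hat C_n$ of the asymptotic covariance of the robust input, the minimized value also yields the test statistic of Theorem 5.2 below.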
The robustness properties of $\tilde\theta_n$ follow from those of $\hat\beta_n$. Let $H(\theta,\beta) = \dot\beta^T(\theta)M(\beta - \beta(\theta))$; then $\dot H_\theta(\theta,\beta)$ is nonsingular in a neighborhood of $\tilde\theta_n$, so $\tilde\theta_n$ is an implicit function of $\hat\beta_n$ determined by $H(\tilde\theta_n,\hat\beta_n) = 0$. Since

$\dot\theta(\beta) = -(\dot H_\theta(\theta,\beta))^{-1}\dot H_\beta(\theta,\beta),$

which approaches $(\dot\beta^T M\dot\beta)^{-1}\dot\beta^T M$ at the estimated value, the influence function of the minimum chi-square estimator $\tilde\theta_n$ is

$IF_{\tilde\theta}(x) = \{(\dot\beta^T M\dot\beta)^{-1}\dot\beta^T M\}\,IF_{\hat\beta}(x).$

Thus $\tilde\theta_n$ inherits the local robustness properties of $\hat\beta_n$. Actually $\tilde\theta_n$ depends on both $\hat\beta_n$ and $M_n$. Since $\tilde\theta_n$ cannot be expressed as an implicit function of M, we cannot give the influence function of $\tilde\theta_n$ with respect to that of $M_n$. Since $\tilde\theta_n$ is more sensitive to $\hat\beta_n$ than to $M_n$, as can be seen from the speeds of convergence required for $\hat\beta_n$ and $M_n$, $\tilde\theta_n$ will have a bounded influence function as long as both $\hat\beta_n$ and $M_n$ do. And since $\beta(\theta)$ is a continuous function of $\theta$ and $\dot\theta(\beta)$ is bounded in a neighborhood of $\hat\beta_n$, it is obvious that $\tilde\theta_n$ also possesses the global robustness of $\hat\beta_n$ as long as $M_n$ has a high breakdown point.

If $\hat\beta_n$ is from an M-estimator instead of an S-estimator, we can proceed in the same way to get an estimator $\tilde\theta_n$ of $\theta_0$ through the above two-stage process. An immediate question is which estimator is more efficient: a direct M-estimator $\hat\theta_n$, or a two-stage estimator $\tilde\theta_n$ based on an M-estimator $\hat\beta_n$? We answer this question in the context of an elliptical distribution. For this we need the asymptotic distribution of an M-estimator $\hat\sigma_n$, which can be obtained from Corollary 2.2 by letting $\sigma(\theta) = \sigma$, or by using the result in Tyler (1982):

$\sqrt{n}(\hat\sigma_n - \sigma_0) \overset{L}{\to} N(0, C)$, where

$C^{-1} = \frac{(1+2a)^2}{2b}D_p^T(\Sigma^{-1}\otimes\Sigma^{-1})D_p + \delta(p)\,D_p^T\mathrm{vec}(\Sigma^{-1})\mathrm{vec}^T(\Sigma^{-1})D_p.$   (4.5)

Using (4.4), the asymptotic covariance of the minimum chi-square estimator $\tilde\theta_n$ is given by

$\Omega^{-1} = \frac{(1+2a)^2}{2b}\dot\varsigma^T(\Sigma^{-1}\otimes\Sigma^{-1})\dot\varsigma + \delta(p)\,\dot\varsigma^T\mathrm{vec}(\Sigma^{-1})\mathrm{vec}^T(\Sigma^{-1})\dot\varsigma.$   (4.6)

Comparing (4.6) with (2.8), we see that the asymptotic efficiency of a minimum chi-square estimator based on an M-estimator $\hat\beta_n$ and that of the direct M-estimator defined in Section 2 will be the same if $p_1 = p$. As we show next, this condition is satisfied in a variety of models.

Theorem 4.3. In an elliptical distribution, the asymptotic efficiency of a minimum chi-square estimator based on an M-estimator $\hat\beta_n$ and that of the direct M-estimator defined in Section 2 will be the same if

$\Sigma(\theta_0) = c_1\dot\Sigma_1 + \dots + c_q\dot\Sigma_q,$   (4.7)

where $\dot\Sigma_j = \dot\Sigma_j(\theta_0) = \partial\Sigma(\theta_0)/\partial\theta_j$.

It is obvious that linear covariance structure models satisfy (4.7). The AR(1) structure $\Sigma(\theta) = \theta_1(\theta_2^{|i-j|})$ satisfies (4.7) since $\theta_{10}\dot\Sigma_1 = \Sigma(\theta_0)$. For most factor analysis models $\Sigma = \Lambda\Phi\Lambda^T + \Psi$, (4.7) is satisfied unless parameters beyond the minimal set needed to achieve identification are set to prespecified nonzero values.

5 Test

As mentioned earlier, we need a quantitative method to judge the adequacy of a proposed model. Two such procedures are proposed in this section. The first is in the context of the M-estimator $\hat\theta_n$ defined in Section 2. The second is the minimum chi-square test, which can be applied to any $a_n$-consistent and asymptotically normal estimator of the covariance.

5.1 Test for M-estimators

We present our test statistics in the context of structured means and structured covariances. Models with structured covariances but unstructured means are special cases. Let $\hat\theta_n$ be the M-estimator of $\theta_0$ defined in Section 2, and let $\hat\beta_n$ be the M-estimator of $\beta_0$ defined by

$\frac1n\sum_{i=1}^n s_i(\hat\beta_n) = 0,$

where

$s_i(\beta) = \begin{pmatrix} u_1(d_i)\Sigma^{-1}(X_i-\mu) \\ \frac12 D_p^T\{u_2(d_i^2)\,\mathrm{vec}[\Sigma^{-1}(X_i-\mu)(X_i-\mu)^T\Sigma^{-1}] - \mathrm{vec}(\Sigma^{-1})\} \end{pmatrix}.$

The asymptotic variance of $\hat\beta_n$ is $\Upsilon = B^{-1}\Xi B^{-T}$ with $\Xi = \mathrm{var}(s_i(\beta_0))$ and $B = E\dot s_i(\beta_0)$. Let $\hat\Upsilon_n$ be any consistent estimator of $\Upsilon$, and let $\dot\beta_c(\theta)$ be a $(p+p^*)\times(p+p^*-q)$ matrix whose columns are orthogonal to those of $\dot\beta(\theta)$.

Theorem 5.1. Under assumptions 2.1 to 2.4 for both the structured and unstructured models, if $\hat\theta_n$ and $\hat\beta_n$ are consistent solutions, then

$T_n = n(\hat\beta_n - \beta(\hat\theta_n))^T\dot\beta_c(\hat\theta_n)\{\dot\beta_c^T(\hat\theta_n)\hat\Upsilon_n\dot\beta_c(\hat\theta_n)\}^{-1}\dot\beta_c^T(\hat\theta_n)(\hat\beta_n - \beta(\hat\theta_n))$   (5.1)

is asymptotically distributed as $\chi^2_{p+p^*-q}$ under the null hypothesis of a correct structure.

Under a sequence of local alternatives, it can be shown that $T_n$ approaches a noncentral chi-square variate, as shown in Chapter 3 of Yuan (1995). We will not pursue the details here. Using Lemma 1 of Khatri (1966), (5.1) can be rewritten as

$T_n = n(\hat\beta_n - \beta(\hat\theta_n))^T\hat\Upsilon_n^{-1}(\hat\beta_n - \beta(\hat\theta_n)) - n(\hat\beta_n - \beta(\hat\theta_n))^T\hat\Upsilon_n^{-1}\dot\beta(\hat\theta_n)\{\dot\beta^T(\hat\theta_n)\hat\Upsilon_n^{-1}\dot\beta(\hat\theta_n)\}^{-1}\dot\beta^T(\hat\theta_n)\hat\Upsilon_n^{-1}(\hat\beta_n - \beta(\hat\theta_n)).$   (5.2)

Compared with (5.1), (5.2) is easier to compute.

5.2 Minimum chi-square test

In Section 4, we presented a two-stage procedure through a minimum chi-square estimator. The minimized index $F_n(\tilde\theta_n)$ can also be used as a test statistic. Let C be nonsingular with a consistent estimator $\hat C_n$, and let $M_n = \hat C_n^{-1}$.

Theorem 5.2. Under assumptions 4.1 to 4.4, if $M = C^{-1}$ and $\tilde\theta_n$ is a consistent solution, then $a_n^2 F_n(\tilde\theta_n) \overset{L}{\to} \chi^2_{p^*-q}$ under the null hypothesis of a correct structure.

By a straightforward argument, it can be shown that $a_n^2 F_n(\tilde\theta_n)$ is distributed as a noncentral chi-square under local alternatives $\beta_1 = \beta_0 + O(1/a_n)$. Since noncentral distributions are not so interesting from a practical point of view, we will not pursue this topic further here.

Note that both statistics in this section are asymptotically distribution free if $\hat\Upsilon_n$ and $\hat C_n$ are distribution free estimators. Furthermore, as discussed in Sections 2 and 3, several consistent estimators for $\Upsilon$ and C exist if the sample is from an elliptical distribution. These correspond to different test statistics. Even though they are all asymptotically chi-square distributed, with small to medium sample sizes these statistics may have significantly different Type I errors, as demonstrated in Yuan and Bentler (1995). This problem is now under further study through simulation.
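Continuing the sketch from Section 4 (again our own illustration), the minimum chi-square test of Theorem 5.2 needs only the minimized index and the degrees of freedom $p^* - q$; here we take $a_n = \sqrt{n}$:

```python
from scipy.stats import chi2

def min_chisq_test(F_min, n, p, q):
    # Theorem 5.2 with a_n = sqrt(n): n * F_n(theta_tilde) is asymptotically
    # chi-square with p* - q degrees of freedom, where p* = p(p + 1) / 2.
    pstar = p * (p + 1) // 2
    stat = n * F_min
    return stat, chi2.sf(stat, df=pstar - q)
```

For the example of Section 6 below (p = 5, q = 11), the degrees of freedom are 15 - 11 = 4.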
6 An Example

In this section, we apply some of the robust procedures developed above to the open and closed book data set from Mardia, Kent, and Bibby (1979, Table 1.2.1). The data set consists of n = 88 cases and p = 5 variables. The covariance structure of this data set was confirmed by Tanaka et al. (1991) as a two-factor model, with the first two variables depending on the first factor and the last three variables depending on the second factor. Specifically, let $X = (x_1,\dots,x_5)^T$; then $X = \Lambda F + E$, where

$\Lambda^T = \Lambda^T(\theta) = \begin{pmatrix} \theta_1 & \theta_2 & 0 & 0 & 0 \\ 0 & 0 & \theta_3 & \theta_4 & \theta_5 \end{pmatrix}, \quad \mathrm{var}(F) = \Phi(\theta) = \begin{pmatrix} 1 & \theta_6 \\ \theta_6 & 1 \end{pmatrix},$

and $\mathrm{var}(E) = \Psi(\theta) = \mathrm{diag}(\theta_7,\dots,\theta_{11})$. So the structured covariance is

$\Sigma(\theta) = \Lambda(\theta)\Phi(\theta)\Lambda^T(\theta) + \Psi(\theta).$   (6.1)

The normal theory maximum likelihood estimator was obtained by Tanaka et al. (1991), and sensitivity analyses by Tanaka et al. (1991), Cadigan (1995), and Lee and Wang (1995) indicate that the 81st case is the most influential point.

We fit (6.1) by the two-stage minimum chi-square procedure discussed in Sections 4 and 5, with M-estimators $\hat\beta_n$. We use this illustration because no efficient algorithms are available for most high breakdown point estimators, as noted by Tyler (1991). Also, M-estimators of $(\mu,\Sigma)$ are straightforward and have been shown to work well in a variety of settings, e.g. principal component analysis (Devlin et al. 1981) and canonical variates (Campbell 1980, 1982). Another reason for choosing the two-stage procedure is that the algorithm for estimating $\theta_0$ can be easily implemented in standard statistical packages by simple modifications of the covariance matrix and the "distribution free" weight matrix used classically.

Six different weight functions were used for the M-estimators $\hat\beta_n$. Two of them come from $t_p(df, \mu, \Sigma)$, a p-variate t-distribution with df degrees of freedom. The corresponding weight functions are

$w_1(d_i) = w_2(d_i^2) = (p + df)/(df + d_i^2).$   (6.2)

We choose df = 1 and df = 5, with df = 1 corresponding to the Cauchy distribution and df = 5 corresponding to the smallest integer degrees of freedom for which the fourth moment of a multivariate t-distribution exists. Note that since we assume that the fourth moment of X exists, we cannot assume that the data are from a Cauchy distribution; here we use the Cauchy density only to get weight functions that guard against observations with large residuals.

Two Huber(q)-type weight functions were used, with

$w_1(d_i) = \begin{cases} 1, & d_i \le k_1 \\ k_1/d_i, & d_i > k_1 \end{cases}$   (6.3)

and $w_2(d_i^2) = \{w_1(d_i)\}^2/k_2$, where $k_1^2$ is the upper qth percentile of $\chi^2_p$ and $k_2$ is a constant satisfying $E[\chi^2_p w_2(\chi^2_p)] = p$. We choose $q_1 = 5$ and $q_2 = 10$ for the two Huber weights.

Campbell (1980) defined the M-estimator $(\hat\mu_n, \hat\Sigma_n)$ through

$\hat\mu_n = \sum_{i=1}^n w_iX_i \big/ \sum_{i=1}^n w_i,$   (6.4)
$\hat\Sigma_n = \sum_{i=1}^n w_i^2(X_i-\hat\mu_n)(X_i-\hat\mu_n)^T \big/ (\sum_{i=1}^n w_i^2 - 1),$   (6.5)

where $w_i = w(d_i) = \omega(d_i)/d_i$ and

$\omega(d_i) = \begin{cases} d_i, & d_i \le d_0 \\ d_0\exp\{-\frac12(d_i-d_0)^2/b_2^2\}, & d_i > d_0 \end{cases}$   (6.6)

with $d_0 = \sqrt{p} + b_1/\sqrt{2}$. Campbell's recommendations for the constants are: (1) $b_1 = 2$, $b_2 = \infty$, which corresponds to Huber-type weights; (2) $b_1 = 2$, $b_2 = 1.25$, which corresponds to Hampel-type weights.
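For reference, the weight functions (6.2), (6.3), and (6.6) transcribe directly into Python as below (our own transcription; the Huber consistency constant $k_2$, which solves $E[\chi^2_p w_2(\chi^2_p)] = p$, is omitted here):

```python
import numpy as np
from scipy.stats import chi2

def w_multivariate_t(d, p, df):
    # (6.2): weights implied by a p-variate t-distribution with df degrees of freedom.
    return (p + df) / (df + d**2)

def w1_huber(d, p, q):
    # (6.3): k1^2 is the upper q-th percentile of chi-square with p df.
    k1 = np.sqrt(chi2.ppf(1.0 - q / 100.0, df=p))
    return np.where(d <= k1, 1.0, k1 / d)

def w_campbell(d, p, b1=2.0, b2=1.25):
    # (6.6): w(d) = omega(d) / d with d0 = sqrt(p) + b1 / sqrt(2);
    # b2 = np.inf reduces to the Huber-type version.
    d0 = np.sqrt(p) + b1 / np.sqrt(2)
    omega = np.where(d <= d0, d, d0 * np.exp(-0.5 * (d - d0) ** 2 / b2**2))
    return omega / d
```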
The "distribution free" weight matrices were used with all six weight functions. The 81st case was identified as the case with the smallest weight under all six weight functions. The estimated parameters, together with their standard errors and the robust test statistics, are given in Table 1, where C-Ha (Campbell-Hampel) and C-Hu (Campbell-Huber) correspond to the weight functions used by Campbell with $b_1 = 2$ and, respectively, $b_2 = 1.25$ and $b_2 = \infty$.

From Table 1 we can see that different weight functions generally give different estimators and different test statistics. Huber(10) gives the largest chi-square statistic, with a corresponding p-value of .042, which still suggests a marginally acceptable fit; the fit indices corresponding to the other weight functions are far from significant, so the null hypothesis (6.1) is not rejected. The multivariate t(1) weight gives the smallest fit index. For most of the robust procedures, the standard errors are smaller than those of the corresponding normal theory MLE. The factor loading and unique variance estimators corresponding to multivariate t(1) are smaller than those from the other procedures, while its estimator of the factor correlation is the largest. Huber weights and Campbell weights give larger factor loadings than those based on a multivariate t-distribution. No overall preference in solution is obvious when comparing the two Huber weights, the Campbell-Huber and Campbell-Hampel weights, or the Huber weights with the Campbell weights. A possible choice among these weights could be made by choosing the estimator with minimum overall variance, e.g., based on the smallest trace or determinant of the estimated asymptotic covariance matrix of the estimator. This possibility is not explored further here.

7 Discussion

In the social and behavioral sciences, the data collection process is usually rough; bad data or outliers commonly exist. Researchers are certainly aware of this problem, but a coherent strategy for dealing with it has not been advanced. Outlier detection or sensitivity analysis has been recommended as a precursor to a formal estimation procedure, though such a method has its own drawbacks (e.g., Bollen 1989). Assuming the quality of the data is good enough, the GLS procedure gives correct inference when the sample size is large enough. However, if most of the data in a sample are good while a few outliers exist, the GLS procedure can totally fail because of its quadratic influence functions. Furthermore, when a sample is from a distribution with medium to large kurtosis, the GLS estimators can be highly inefficient, while properly downweighting cases with large residuals usually leads to more efficient estimators.

Robust methods have been developed and used in many different areas of statistics, and the theory for robustly estimating multivariate location and scatter has been studied extensively. While robust methods have been used in principal components and canonical variates, their application to covariance structure analysis has been rather slow. This may be because robust methods in principal components and canonical variates are just eigenvalue-eigenvector decompositions of robustified matrices, which are rather straightforward, while a complete theory for robust covariance structure analysis involving estimation and testing is relatively more complicated. As we have mentioned, our two-stage procedure can be easily adapted into EQS (Bentler 1995) or LISREL (Jöreskog and Sörbom 1993), the major software packages in this field. Greater availability of such new methods will hopefully promote the application of robust procedures in covariance structure analysis in the social and behavioral sciences.

Covariance structure analysis is usually used to model high-dimensional data sets. In this situation, a high breakdown point estimator like the S-estimator is necessary when a sample has problematic observations. Since computational problems exist for all known high breakdown point estimators of multivariate location and scatter, this may limit their immediate use in covariance structure analysis. We hope efficient algorithms for this situation will be developed in the near future.
Since the data set we used is of dimension 5, and only the 81st case has been identified as an influential point, there may not be much difference between the S-estimator and an M-estimator for this data set. But in general applications of covariance structure analysis, high breakdown point estimators like the S-estimator will be necessary to get a robust analysis.

The results we obtained in our empirical study yield no clear winner among the various weight functions used to define our robust M-estimators. These results may or may not apply to other data sets. Further study, possibly based on extensive empirical simulation, on different models, and on different underlying distributions, is necessary to obtain a better understanding of the general merits of these different weights.

8 Appendix

All the theorems are proved in this appendix. We need some lemmas for the proofs of Theorems 2.1 and 2.2. The following lemma is from Theorem 2 of Jennrich (1969); essentially the same result is stated in Theorem 4.2.1 of Amemiya (1985).

Lemma A.1. Let $X_1, X_2, \dots$ be an i.i.d. sample, and let $h(x,\theta)$ be measurable in x and continuous in $\theta\in\Theta$, a compact set. If $E\sup_{\theta\in\Theta}\|h(X,\theta)\| < \infty$, then with probability one $\frac1n\sum_{i=1}^n h(X_i,\theta)$ converges to $Eh(X,\theta)$ uniformly for all $\theta\in\Theta$.

Let $h_i(\theta)$, $i = 1, 2, \dots$ be independent p-variate stochastic functions of $\theta\in\Theta$ and

$H_n(\theta) = \frac1n\sum_{i=1}^n h_i(\theta).$

Some results of Yuan and Jennrich (1995) will be used. We need the following assumptions for the next two lemmas.

Assumptions A:
A.1. $H_n(\theta_0)\to 0$ with probability one.
A.2. There is a neighborhood N of $\theta_0$ on which, with probability one, the $H_n(\theta)$ are continuously differentiable and the Jacobians $\dot H_n(\theta)$ converge uniformly to a nonstochastic function which is nonsingular at $\theta_0$.

The following two lemmas are Theorems 1 and 2 of Yuan and Jennrich (1995).

Lemma A.2. Under assumptions A.1 and A.2, with probability one, for any r > 0 there are $\hat\theta_n\in B_r(\theta_0)$ such that $H_n(\hat\theta_n) = 0$ for all n sufficiently large.

Lemma A.3. Under assumptions A.1 and A.2, with probability one there are zeros $\hat\theta_n$ of $H_n(\theta)$ such that $\hat\theta_n\to\theta_0$.

Proof of Theorems 2.1 and 2.2: Since $X_1, \dots, X_n$ is an i.i.d. sample, assumption 2.4 implies assumption A.1, so we only need to check that assumptions 2.1 to 2.4 imply assumption A.2. Assumption 2.4 implies that $E\dot g_\theta(X,\theta)$ exists in a neighborhood of $\theta_0$ and $A = E\dot g_\theta(X,\theta_0)$ is nonsingular; by Lemma A.1 we only need to show that $\dot g_\theta(x,\theta)$ is bounded by an integrable function. Let

$A_1(x,\theta) = \frac{\dot u_1(d(x,\theta))}{2d(x,\theta)}\{2\dot\mu^T(\theta)T(x,\theta)\dot\mu(\theta) + \dot\mu^T(\theta)\Sigma^{-1}(\theta)(x-\mu(\theta))\,\mathrm{vec}^T(T(x,\theta))\dot\varsigma(\theta)\},$
$A_2(x,\theta) = u_1(d(x,\theta))\{[\frac{\partial}{\partial\theta^T}(\dot\mu^T(\theta)\Sigma^{-1}(\theta))](x-\mu(\theta)) - \dot\mu^T(\theta)\Sigma^{-1}(\theta)\dot\mu(\theta)\},$
$A_3(x,\theta) = \frac12[\frac{\partial}{\partial\theta^T}\dot\varsigma^T(\theta)]\{u_2(d^2(x,\theta))\,\mathrm{vec}(T(x,\theta)) - \mathrm{vec}(\Sigma^{-1}(\theta))\},$
$A_4(x,\theta) = \frac12\dot u_2(d^2(x,\theta))\,\dot\varsigma^T(\theta)\mathrm{vec}(T(x,\theta))\{2(x-\mu(\theta))^T\Sigma^{-1}(\theta)\dot\mu(\theta) + \mathrm{vec}^T(T(x,\theta))\dot\varsigma(\theta)\},$
$A_5(x,\theta) = \frac12\dot\varsigma^T(\theta)\{u_2(d^2(x,\theta))\,\mathrm{vec}(\dot T(x,\theta)) + (\Sigma^{-1}(\theta)\otimes\Sigma^{-1}(\theta))\dot\varsigma(\theta)\};$

then $\dot g_\theta(x,\theta) = A_1(x,\theta) + A_2(x,\theta) + A_3(x,\theta) + A_4(x,\theta) + A_5(x,\theta)$, where

$\mathrm{vec}(\dot T(x,\theta)) = -(\{T(x,\theta)\otimes\Sigma^{-1}(\theta)\}\dot\varsigma(\theta) + \{\Sigma^{-1}(\theta)\otimes T(x,\theta)\}\dot\varsigma(\theta) + \{[\Sigma^{-1}(\theta)(x-\mu(\theta))]\otimes\Sigma^{-1}(\theta)\}\dot\mu(\theta) + \{\Sigma^{-1}(\theta)\otimes[\Sigma^{-1}(\theta)(x-\mu(\theta))]\}\dot\mu(\theta)).$

Since X has a finite fourth moment and $\Sigma(\theta) > \delta I$ in a neighborhood of $\theta_0$, both $d(x,\theta)$ and $d^2(x,\theta)$ are bounded by integrable functions in a neighborhood of $\theta_0$. Since

$\|A_1(x,\theta)\| \le |\dot u_1(d(x,\theta))|\{d(x,\theta)\,\|\dot\mu^T(\theta)\Sigma^{-1}(\theta)\dot\mu(\theta)\| + \frac12 d^2(x,\theta)\,\|\dot\mu^T(\theta)\Sigma^{-\frac12}(\theta)\|\,\|\Sigma^{-\frac12}(\theta)\|^2\,\|\dot\varsigma\|\},$

there is a neighborhood of $\theta_0$ on which $A_1(X,\theta)$ is bounded by an integrable function. In the same way it can be shown that $A_2(x,\theta)$ to $A_5(x,\theta)$ are bounded by integrable functions in a neighborhood of $\theta_0$, and hence so is $\dot g_\theta(X,\theta)$.
Proof of Theorem 2.3: Using a Taylor expansion of $G_n(\theta)$, we have

$G_n(\hat\theta_n) = G_n(\theta_0) + \dot G_n(\theta_n^*)(\hat\theta_n - \theta_0),$   (A.1)

where $\dot G_n(\theta_n^*)$ indicates that each row of $\dot G_n(\theta)$ is evaluated at a consistent estimator $\theta_n^*$ of $\theta_0$. Since $V = \mathrm{var}(g_i(\theta_0)) > 0$, the multivariate version of the classical central limit theorem implies

$\sqrt{n}\,G_n(\theta_0) \overset{L}{\to} N(0, V).$

Since $\dot G_n(\theta_n^*) \overset{P}{\to} A = E\dot g_i(\theta_0)$ and A is nonsingular, (A.1) can be written as

$\sqrt{n}(\hat\theta_n - \theta_0) = -A^{-1}\sqrt{n}\,G_n(\theta_0) + o_p(1).$   (A.2)

So the theorem follows from (A.2) and the Slutsky theorem.

Proof of Corollaries 2.1 and 2.2: Since the proofs of these two corollaries are similar, and both involve tedious but direct calculations, we only list the key properties of an elliptical distribution used in the proofs. Let $X = (x_1,\dots,x_p)^T$ obey an elliptical distribution; then

$E[(x_i-\mu_i)^{r_i}(x_j-\mu_j)^{r_j}(x_k-\mu_k)^{r_k}(x_l-\mu_l)^{r_l}] = 0$   (A.3)

if $r = r_i + r_j + r_k + r_l$ is an odd number. Let $d(X) = \{(X-\mu_0)^T\Sigma_0^{-1}(X-\mu_0)\}^{1/2}$ and $U = \Sigma_0^{-1/2}(X-\mu_0)/d(X)$; then $d(X)$ and U are independent, with

$EUU^T = \frac1p I$   (A.4)

and

$E\,\mathrm{vec}(UU^T)\,\mathrm{vec}^T(UU^T) = \frac{1}{p(p+2)}\{I_{p^2} + K_p + \mathrm{vec}(I_p)\mathrm{vec}^T(I_p)\},$   (A.5)

where $K_p$ is the commutation matrix defined in Magnus and Neudecker (1988, p. 46). Equations (A.3) and (A.4) can be found in Fang et al. (1990, p. 34), and equation (A.5) was proved in Magnus and Neudecker (1979). Another property used is $E[d^2(X)u_2(d^2(X))] = p$; this follows since the estimating equation is unbiased. With these properties, the proofs of the two corollaries are straightforward.

Proof of Theorem 4.1: Since $a_n\to\infty$, $\beta(\theta)$ is continuous, and $M_n$ is consistent, we have

$F_n(\theta) - F_n(\theta_0) = 2(\hat\beta_n - \beta(\theta_0))^TM_n(\beta(\theta_0) - \beta(\theta)) + (\beta(\theta) - \beta(\theta_0))^TM_n(\beta(\theta) - \beta(\theta_0)) \overset{P}{\to} (\beta(\theta) - \beta(\theta_0))^TM(\beta(\theta) - \beta(\theta_0))$   (A.6)

uniformly in $\theta\in\Theta$. Let $\tilde\theta_{n_i}$ be any convergent subsequence of $\tilde\theta_n$ with limit $\theta^*$; then

$F_{n_i}(\tilde\theta_{n_i}) - F_{n_i}(\theta_0) \le 0.$   (A.7)

Letting $n_i\to\infty$ in (A.7) gives $(\beta(\theta^*) - \beta(\theta_0))^TM(\beta(\theta^*) - \beta(\theta_0)) \le 0$. So $\theta^* = \theta_0$. Since $\tilde\theta_{n_i}$ can be any convergent subsequence, $\tilde\theta_n \overset{P}{\to} \theta_0$.

Proof of Theorem 4.2: From

$\dot\beta^T(\tilde\theta_n)M_n(\hat\beta_n - \beta(\tilde\theta_n)) = 0,$

it follows that

$\dot\beta^T(\tilde\theta_n)M_n(\hat\beta_n - \beta(\theta_0)) = \dot\beta^T(\tilde\theta_n)M_n(\beta(\tilde\theta_n) - \beta(\theta_0)) = \dot\beta^T(\tilde\theta_n)M_n\dot\beta(\theta_n^*)(\tilde\theta_n - \theta_0),$   (A.8)

where $\dot\beta(\theta_n^*)$ indicates that each row of $\dot\beta(\theta)$ is evaluated at a consistent estimator $\theta_n^*$ of $\theta_0$. Since $\dot\beta^T(\tilde\theta_n)M_n\dot\beta(\theta_n^*) \overset{P}{\to} \dot\beta^TM\dot\beta$, which is positive definite, (A.8) can be rewritten as

$a_n(\tilde\theta_n - \theta_0) = [\dot\beta^T(\tilde\theta_n)M_n\dot\beta(\theta_n^*)]^{-1}\dot\beta^T(\tilde\theta_n)M_n\,a_n(\hat\beta_n - \beta(\theta_0)).$   (A.9)

The theorem follows from (A.9) and the Slutsky theorem.

Proof of Theorem 4.3: Since $\mathrm{vec}(\Sigma^{-1}) = (\Sigma^{-\frac12}\otimes\Sigma^{-\frac12})\mathrm{vec}(I)$, we have $p_1 = \mathrm{vec}^T(I)P_\Delta\mathrm{vec}(I)$, where $P_\Delta = \Delta(\Delta^T\Delta)^{-1}\Delta^T$ is the usual projection matrix formed from $\Delta = (\Sigma^{-\frac12}\otimes\Sigma^{-\frac12})\dot\varsigma$. If $\mathrm{vec}(I)$ is in the column space of $\Delta$, then $p_1 = p$. From

$\Delta = (\mathrm{vec}(\Sigma^{-\frac12}\dot\Sigma_1\Sigma^{-\frac12}), \dots, \mathrm{vec}(\Sigma^{-\frac12}\dot\Sigma_q\Sigma^{-\frac12})),$

$\mathrm{vec}(I)$ being in the column space of $\Delta$ is equivalent to

$c_1\Sigma^{-\frac12}\dot\Sigma_1\Sigma^{-\frac12} + \dots + c_q\Sigma^{-\frac12}\dot\Sigma_q\Sigma^{-\frac12} = I$

for some $c_1, \dots, c_q$; that is, $\Sigma$ can be expressed as $\Sigma = c_1\dot\Sigma_1 + \dots + c_q\dot\Sigma_q$.

Proof of Theorem 5.1: From (A.2),

$\sqrt{n}(\hat\theta_n - \theta_0) = -A^{-1}\sqrt{n}\,G_n(\theta_0) + o_p(1).$   (A.10)

Since an unstructured model is also a structure, we have

$\sqrt{n}(\hat\beta_n - \beta_0) = -B^{-1}\sqrt{n}\,S_n(\beta_0) + o_p(1),$   (A.11)

where $S_n(\beta) = \frac1n\sum_{i=1}^n s_i(\beta)$. Direct calculation shows

$G_n(\theta) = \dot\beta^T(\theta)S_n(\beta(\theta)).$   (A.12)

From (A.10), (A.11) and (A.12), we have

$\sqrt{n}(\beta(\hat\theta_n) - \beta(\theta_0)) = \dot\beta\,\sqrt{n}(\hat\theta_n - \theta_0) + o_p(1) = \dot\beta A^{-1}\dot\beta^TB\,\sqrt{n}(\hat\beta_n - \beta_0) + o_p(1),$   (A.13)

and consequently

$\sqrt{n}(\hat\beta_n - \beta(\hat\theta_n)) = (I - \dot\beta A^{-1}\dot\beta^TB)\sqrt{n}(\hat\beta_n - \beta_0) + o_p(1).$   (A.14)

Since

$\sqrt{n}(\hat\beta_n - \beta_0) \overset{L}{\to} N(0, \Upsilon),$   (A.15)

(A.14) and (A.15) imply

$\sqrt{n}(\hat\beta_n - \beta(\hat\theta_n)) \overset{L}{\to} N(0, V^*),$   (A.16)

where

$V^* = (I - \dot\beta A^{-1}\dot\beta^TB)\,\Upsilon\,(I - \dot\beta A^{-1}\dot\beta^TB)^T.$

Since the columns of $\dot\beta_c(\theta)$ are orthogonal to those of $\dot\beta(\theta)$,
$\dot\beta_c^T(\hat\theta_n)\sqrt{n}(\hat\beta_n - \beta(\hat\theta_n)) \overset{L}{\to} N(0, \dot\beta_c^T(\theta_0)V^*\dot\beta_c(\theta_0))$

and

$\dot\beta_c^T(\theta_0)V^*\dot\beta_c(\theta_0) = \dot\beta_c^T(\theta_0)\,\Upsilon\,\dot\beta_c(\theta_0).$

So the statistic

$T_n = n(\hat\beta_n - \beta(\hat\theta_n))^T\dot\beta_c(\hat\theta_n)\{\dot\beta_c^T(\hat\theta_n)\hat\Upsilon_n\dot\beta_c(\hat\theta_n)\}^{-1}\dot\beta_c^T(\hat\theta_n)(\hat\beta_n - \beta(\hat\theta_n))$

is asymptotically distributed as $\chi^2_{p+p^*-q}$ under the null hypothesis of a correct structure.

Proof of Theorem 5.2: From (A.9) it follows that

$a_n(\beta(\tilde\theta_n) - \beta(\theta_0)) = \dot\beta(\dot\beta^TC^{-1}\dot\beta)^{-1}\dot\beta^TC^{-1}a_n(\hat\beta_n - \beta(\theta_0)) + o_p(1).$   (A.17)

So

$a_n(\hat\beta_n - \beta(\tilde\theta_n)) = a_n(\hat\beta_n - \beta(\theta_0)) + a_n(\beta(\theta_0) - \beta(\tilde\theta_n)) = \{I - \dot\beta(\dot\beta^TC^{-1}\dot\beta)^{-1}\dot\beta^TC^{-1}\}a_n(\hat\beta_n - \beta(\theta_0)) + o_p(1),$

and $a_n^2F_n(\tilde\theta_n)$ can be written as

$a_n^2F_n(\tilde\theta_n) = a_n(\hat\beta_n - \beta(\theta_0))^TC^{-\frac12}PC^{-\frac12}a_n(\hat\beta_n - \beta(\theta_0)) + o_p(1),$   (A.18)

where

$P = I - C^{-\frac12}\dot\beta(\dot\beta^TC^{-1}\dot\beta)^{-1}\dot\beta^TC^{-\frac12}$

is a projection matrix of rank $p^* - q$. The theorem follows from (A.18).

References

[1] Amemiya, T. (1985), Advanced Econometrics, Cambridge, MA: Harvard University Press.
[2] Bentler, P. M. (1995), EQS Structural Equations Program Manual, Encino, CA: Multivariate Software.
[3] Bentler, P. M., and Dudgeon, P. (1996), "Covariance structure analysis: Statistical practice, theory, and directions," Annual Review of Psychology, 47, 541-570.
[4] Bollen, K. A. (1989), Structural Equations with Latent Variables, New York: Wiley.
[5] Browne, M. W. (1982), "Covariance structures," in Topics in Applied Multivariate Analysis, ed. D. M. Hawkins, Cambridge: Cambridge University Press, pp. 72-141.
[6] Browne, M. W. (1984), "Asymptotic distribution-free methods for the analysis of covariance structures," British Journal of Mathematical and Statistical Psychology, 37, 62-83.
[7] Cadigan, N. G. (1995), "Local influence in structural equation models," Structural Equation Modeling, 2, 13-30.
[8] Campbell, N. A. (1980), "Robust procedures in multivariate analysis I: Robust covariance estimation," Applied Statistics, 29, 231-237.
[9] Campbell, N. A. (1982), "Robust procedures in multivariate analysis II: Robust canonical variate analysis," Applied Statistics, 31, 1-8.
[10] Chamberlain, G. (1982), "Multivariate regression models for panel data," Journal of Econometrics, 18, 5-46.
[11] Davies, P. L. (1987), "Asymptotic behavior of S-estimators of multivariate location parameters and dispersion matrices," The Annals of Statistics, 15, 1269-1292.
[12] Devlin, S. J., Gnanadesikan, R., and Kettenring, J. R. (1981), "Robust estimation of dispersion matrices and principal components," Journal of the American Statistical Association, 76, 354-362.
[13] Donoho, D. L. (1982), "Breakdown properties of multivariate location estimators," Ph.D. qualifying paper, Department of Statistics, Harvard University.
[14] Donoho, D. L., and Liu, R. C. (1988a), "The 'automatic' robustness of minimum distance functionals," The Annals of Statistics, 16, 552-586.
[15] Donoho, D. L., and Liu, R. C. (1988b), "Pathologies of some minimum distance estimators," The Annals of Statistics, 16, 587-608.
[16] Fang, K. T., Kotz, S., and Ng, K. W. (1990), Symmetric Multivariate and Related Distributions, London: Chapman & Hall.
[17] Ferguson, T. (1958), "A method of generating best asymptotically normal estimates with application to the estimation of bacterial densities," Annals of Mathematical Statistics, 29, 1046-1062.
[18] Fu, J. C. (1989), "Method of KIM-ZAM: an algorithm for computing the maximum likelihood estimator," Statistics & Probability Letters, 8, 289-296.
[19] Hampel, F. R., Ronchetti, E. M., Rousseeuw, P. J., and Stahel, W. A. (1986), Robust Statistics: The Approach Based on Influence Functions, New York: Wiley.
[20] Hu, L., Bentler, P. M., and Kano, Y. (1992), "Can test statistics in covariance structure analysis be trusted?" Psychological Bulletin, 112, 351-362.
[21] Huber, P. J. (1977), "Robust covariances," in Statistical Decision Theory and Related Topics, Vol. 2, eds. S. S. Gupta and D. S. Moore, New York: Academic Press, pp. 165-191.
[22] Huber, P. J. (1981), Robust Statistics, New York: Wiley.
[23] Jennrich, R. I. (1969), "Asymptotic properties of non-linear least squares estimators," Annals of Mathematical Statistics, 40, 633-643.
[24] Jöreskog, K. G., and Sörbom, D. (1993), LISREL 8 User's Reference Guide, Chicago: Scientific Software International.
[25] Kano, Y. (1994), "Consistency property of elliptical probability density functions," Journal of Multivariate Analysis, 51, 139-147.
[26] Kent, J. T., Tyler, D. E., and Vardi, Y. (1994), "A curious likelihood identity for the multivariate t-distribution," Communications in Statistics - Simulation and Computation, 23, 441-453.
[27] Khatri, C. G. (1966), "A note on a MANOVA model applied to problems in growth curves," Annals of the Institute of Statistical Mathematics, 18, 75-86.
[28] Lee, S. Y., and Wang, S. J. (1995), "Sensitivity analysis of structural equation models," Psychometrika, in press.
[29] Lopuhaä, H. P. (1989), "On the relation between S-estimators and M-estimators of multivariate location and covariance," The Annals of Statistics, 17, 1662-1683.
[30] Lopuhaä, H. P. (1991a), "τ-estimators for location and scatter," The Canadian Journal of Statistics, 19, 307-321.
[31] Lopuhaä, H. P. (1991b), "Breakdown point and asymptotic properties of multivariate S-estimators and τ-estimators: a summary," in Directions in Robust Statistics and Diagnostics: Part I, eds. W. Stahel and S. Weisberg, New York: Springer-Verlag, pp. 167-182.
[32] Magnus, J. R., and Neudecker, H. (1979), "The commutation matrix: some properties and applications," The Annals of Statistics, 7, 381-394.
[33] Magnus, J. R., and Neudecker, H. (1988), Matrix Differential Calculus with Applications in Statistics and Econometrics, New York: Wiley.
[34] Mardia, K. V., Kent, J. T., and Bibby, J. M. (1979), Multivariate Analysis, New York: Academic Press.
[35] Maronna, R. A. (1976), "Robust M-estimators of multivariate location and scatter," The Annals of Statistics, 4, 51-67.
[36] Rousseeuw, P. J. (1983), "Multivariate estimation with high breakdown point," Fourth Pannonian Symposium on Mathematical Statistics, Bad Tatzmanndorf, Austria.
[37] Serfling, R. J. (1980), Approximation Theorems of Mathematical Statistics, New York: Wiley.
[38] Stahel, W. A. (1981), "Breakdown of covariance estimators," Research Report 31, Fachgruppe für Statistik, ETH, Zürich.
[39] Tamura, R. N., and Boos, D. D. (1986), "Minimum Hellinger distance estimation for multivariate location and covariance," Journal of the American Statistical Association, 81, 223-229.
[40] Tanaka, Y., Watadani, S., and Moon, S. H. (1991), "Influence in covariance structure analysis: with an application to confirmatory factor analysis," Communications in Statistics - Theory and Methods, 20, 3805-3821.
[41] Tyler, D. E. (1982), "Radial estimates and the test for sphericity," Biometrika, 69, 429-436.
[42] Tyler, D. E. (1983), "Robustness and efficiency properties of scatter matrices," Biometrika, 70, 411-420.
[43] Tyler, D. E. (1987), "A distribution-free M-estimator of multivariate scatter," The Annals of Statistics, 15, 234-251.
[44] Tyler, D. E. (1991), "Some issues in the robust estimation of multivariate location and scatter," in Directions in Robust Statistics and Diagnostics: Part II, eds. W. Stahel and S. Weisberg, New York: Springer-Verlag, pp. 145-157.
[45] Wilcox, R. R. (1994), "The percentage bend correlation coefficient," Psychometrika, 59, 601-616.
[46] Yuan, K.-H. (1995), Asymptotics for Nonlinear Regression Models with Applications, unpublished Ph.D. dissertation, Department of Mathematics, UCLA.
[47] Yuan, K.-H., and Bentler, P. M. (1995), "Mean and covariance structure analysis: theoretical and practical improvements," under editorial review.
[48] Yuan, K.-H., and Jennrich, R. I. (1995), "Estimating equation asymptotics," under editorial review.